{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Bulk data access\n",
    "\n",
    "This tutorial explains how to retrieve full tables from the database into [pandas DataFrames](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.html).\n",
    "\n",
    "## Available tables in ``mendeleev``\n",
    "\n",
    "- [elements](#elements): The main table with element properties\n",
    "- [groups](#groups): Metadata about periodic table groups\n",
    "- [ionicradii](#ionicradii): Ionic radii data\n",
    "- [ionizationenergies](#ionizationenergies): Ionization energies\n",
    "- [isotopes](#isotopes): Isotopes and their properties\n",
    "- [isotopedecaymodes](#isotopedecaymodes): Data on isotope decay modes\n",
    "- [oxidationstates](#oxidationstates): Oxidation states for elements\n",
    "- [phasetransitions](#phasetransitions): Phase transition data for elements\n",
    "- [propertymetadata](#propertymetadata): Metadata about properties from all tables\n",
    "- [scattering_factors](#scattering_factors): Atomic scattering factors\n",
    "- [screeningconstants](#screeningconstants): Nuclear screening constants\n",
    "- [series](#series): Metadata about periodic table series\n",
    "\n",
    "All data is stored in a sqlite database that is shipped together with the package. You can interact directly with the database if you need more flexibility but for convenience ``mendeleev`` provides a few functions in the `fetch` module to\n",
    "retrieve data.\n",
    "\n",
    "## Fetching computed properties in bulk\n",
    "\n",
    "- [Electronegativities](#Electronegativities) A collection of [electronegativity](https://mendeleev.readthedocs.io/en/stable/electronegativity.html) scales computed based on other properties."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To fetch whole tables you can use `fetch_table`. The function can be imported from `mendeleev.fetch`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "from mendeleev.fetch import fetch_table"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To retrieve a table call the ``fetch_table`` with the table name as argument. Here we'll get probably the most important table ``elements`` with basis data on each element"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## elements"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "ptable = fetch_table(\"elements\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now we can use [pandas'](http://pandas.pydata.org) capabilities to work with the data. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "ptable.info()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For clarity let's take only a subset of columns "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "cols = [\n",
    "    \"atomic_number\",\n",
    "    \"symbol\",\n",
    "    \"atomic_radius\",\n",
    "    \"en_pauling\",\n",
    "    \"block\",\n",
    "    \"vdw_radius_mm3\",\n",
    "]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "ptable[cols].head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "It is quite easy now to get descriptive statistics on the data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "ptable[cols].describe()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## groups\n",
    "\n",
    "Periodic table group metadata."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "groups = fetch_table(\"groups\")\n",
    "groups"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## ionicradii\n",
    "\n",
    "The function to fetch ionic radii is called `fetch_ionic_radii` and can either fetch ionic or crystal radii depending on the `radius` argument."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from mendeleev.fetch import fetch_ionic_radii"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "irs = fetch_ionic_radii(radius=\"ionic_radius\")\n",
    "irs.head(10)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## ionizationenergies\n",
    "\n",
    "To fetch ionization energies use `fetch_ionization_energies` that takes a `degree` (default is `degree=1`) argument that can either be a single integer or a list if integers to fetch multiple ionization energies."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from mendeleev.fetch import fetch_ionization_energies"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "ies = fetch_ionization_energies(degree=2)\n",
    "ies.head(10)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To fetch multiple ionization degress at once simply pass a list of degrees:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "ies_multiple = fetch_ionization_energies(degree=[1, 3, 5])\n",
    "ies_multiple.head(10)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## isotopes\n",
    "\n",
    "Let try and retrieve another table, namely ``isotopes``"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "isotopes = fetch_table(\"isotopes\", index_col=\"id\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "isotopes.info()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here's a few first records from that dataframe"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "isotopes.head(10)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Merge the elements table with the isotopes\n",
    "\n",
    "We can now perform SQL-like merge operation on two ``DataFrame``s and produce an [outer](http://pandas.pydata.org/pandas-docs/stable/merging.html#database-style-dataframe-joining-merging) join "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "merged = pd.merge(ptable[cols], isotopes, how=\"outer\", on=\"atomic_number\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "now we have the following columns in the ``merged`` ``DataFrame``"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "merged.info()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "merged.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To display all the isotopes of Silicon"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "merged[merged[\"symbol\"] == \"Si\"]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## isotopedecaymodes\n",
    "\n",
    "Different modes are explained in [the documentation](https://mendeleev.readthedocs.io/en/stable/data.html#isotope-decay-modes)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "idm = fetch_table(\"isotopedecaymodes\")\n",
    "idm"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## oxidationstates"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "ox = fetch_table(\"oxidationstates\")\n",
    "ox"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## phasetransitions\n",
    "\n",
    "For most elements a single row is present with phase transition properies, however for elements with multiple allotropes (i.e. Carbon, Phosphorus, Selenium, Sulfur, Tin) values are provided for each allotrope."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "pt = fetch_table(\"phasetransitions\")\n",
    "pt.sort_values(by=\"atomic_number\").head(15)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## propertymetadata\n",
    "\n",
    "This table holds a lot of the metadata that is used to render the data documentation page. Probably the most useful piece of information here is the unit assocaited with a given property. The units are compatible with the [pint](https://pint.readthedocs.io/en/stable/) package so properties might be used direclty with units."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "metadata = fetch_table(\"propertymetadata\")\n",
    "metadata.head(10)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## scattering_factors"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "scattering_factors = fetch_table(\"scattering_factors\")\n",
    "scattering_factors.head(10)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## screeningconstants"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "screening = fetch_table(\"screeningconstants\")\n",
    "screening.head(10)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## series\n",
    "\n",
    "Metadata about periodic table series."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "series = fetch_table(\"series\")\n",
    "series"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Electronegativities\n",
    "\n",
    "To fetch all data from electronegatuivity scales use `fetch_electronegativities`. This can take a few seconds since most of the values need to be computed."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from mendeleev.fetch import fetch_electronegativities"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "ens = fetch_electronegativities()\n",
    "ens.head(10)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Version information"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import mendeleev\n",
    "\n",
    "print(f\"{mendeleev.__version__=}\")"
   ]
  }
 ],
 "metadata": {
  "anaconda-cloud": {},
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
